
Prediction

Predictions

class eole.predict.prediction.Prediction(src, srclen, pred_sents, attn, pred_scores, estim, tgt_sent, gold_score, word_aligns, ind_in_bucket)

Bases: object

Container for a predicted sentence.

  • Variables:
    • src (LongTensor) – Source word IDs.
    • srclen (List[int]) – Source lengths.
    • pred_sents (List[List[str]]) – Words from the n-best predictions.
    • pred_scores (List[List[float]]) – Log-probs of the n-best predictions.
    • attns (List[FloatTensor]) – Attention distribution for each prediction.
    • gold_sent (List[str]) – Words from the gold prediction.
    • gold_score (List[float]) – Log-prob of the gold prediction.
    • word_aligns (List[FloatTensor]) – Word alignment distribution for each prediction.

log(sent_number, src_raw='')

Log prediction.

class eole.predict.prediction.PredictionBuilder(vocabs, n_best=1, replace_unk=False, phrase_table='', tgt_eos_idx=None, id_tokenization=False)

Bases: object

Build a word-based prediction from the batch output of the predictor and the underlying dictionaries.

Replacement based on “Addressing the Rare Word Problem in Neural Machine Translation” (Luong et al., 2015).

  • Parameters:
    • vocabs (dict) – The underlying dictionaries.
    • n_best (int) – Number of predictions produced.
    • replace_unk (bool) – Replace unknown words using attention.
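
The replace_unk behavior can be sketched as follows. The helper name and list-based layout here are illustrative only, not the actual PredictionBuilder internals: each <unk> in the output is replaced by the source word that received the most attention at that decoding step.

```python
def replace_unk_tokens(pred_tokens, src_tokens, attn, unk_token="<unk>"):
    """Replace each <unk> in the prediction with the source token that
    received the highest attention weight at that decoding step.

    pred_tokens: list[str], predicted words (may contain <unk>)
    src_tokens:  list[str], raw source words
    attn:        list[list[float]], one attention row per predicted word
    """
    out = []
    for i, tok in enumerate(pred_tokens):
        if tok == unk_token:
            # source position the decoder attended to most at step i
            row = attn[i][: len(src_tokens)]
            max_idx = max(range(len(row)), key=lambda j: row[j])
            out.append(src_tokens[max_idx])
        else:
            out.append(tok)
    return out
```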

Predictor Classes

class eole.predict.inference.Inference(model, vocabs, config, model_config, device_id=0, global_scorer=None, report_score=True, logger=None, return_gold_log_probs=False)

Bases: object

Predict a batch of sentences with a saved model.

  • Parameters:
    • model (eole.modules.BaseModel) – Model to use for prediction
    • vocabs (dict[str, Vocab]) – A dict mapping each side to its Vocab.
    • config
    • model_config
    • device_id
    • global_scorer (eole.predict.GNMTGlobalScorer) – Prediction scoring/reranking object.
    • report_score (bool) – Whether to report scores
    • logger (logging.Logger or NoneType) – Logger.

predict_batch(batch, attn_debug, streamer=None)

Predict a batch of sentences.

class eole.predict.Translator(model, vocabs, config, model_config, device_id=0, global_scorer=None, report_score=True, logger=None, return_gold_log_probs=False)

Bases: Inference

predict_batch(batch, attn_debug, streamer=None)

Translate a batch of sentences.

class eole.predict.GeneratorLM(model, vocabs, config, model_config, device_id=0, global_scorer=None, report_score=True, logger=None, return_gold_log_probs=False)

Bases: Inference

predict_batch(batch, attn_debug, scoring=False, streamer=None)

Predict a batch of sentences.

  • Parameters:
    • batch – Batch of source data.
    • attn_debug (bool) – Whether to return attention weights.
    • scoring (bool) – Whether to run in scoring mode.
    • streamer (GenerationStreamer , optional) – If provided, tokens are pushed to the streamer at each decoding step to enable token-by-token output streaming.

class eole.predict.Encoder(model, vocabs, config, model_config, device_id=0, global_scorer=None, report_score=True, logger=None, return_gold_log_probs=False)

Bases: Inference

predict_batch(batch, attn_debug, streamer=None)

Predict a batch of sentences.

class eole.predict.AudioPredictor(model, vocabs, config, model_config, device_id=0, global_scorer=None, report_score=True, logger=None, return_gold_log_probs=False)

Bases: Translator

Translator subclass for audio encoder-decoder models.

Adds:

  • Token suppression (suppress_tokens from eole config)
  • Forced decoder prefix (SOT, language, task tokens)
  • Sequential timestamp-seeking: decodes audio windows using timestamp tokens to determine seek advancement
  • Configurable timestamp output: none (plain text), segment (JSON), word

predict_batch(batch, attn_debug, streamer=None)

Override to inject decoder prefix tensor into batch.

Streaming

class eole.predict.streamer.GenerationStreamer(vocabs, transform_pipe=None, timeout: float = 120.0)

Bases: object

Streamer for token-by-token generation output.

Tokens are put into a thread-safe queue by the generation loop and can be consumed as a Python iterator. The streamer handles incremental detokenization so that consumers receive human-readable text chunks.

This is primarily designed for use with GeneratorLM (decoder-only LLM models). For best results, use with batch_size=1.

  • Parameters:
    • vocabs (dict) – Vocabulary dictionaries from the model.
    • transform_pipe (TransformPipe, optional) – Transform pipeline for detokenization. When provided (typical for HuggingFace / id-tokenization models), full-sequence incremental decoding is used to yield clean text. When None, tokens are looked up directly in the vocabulary.
    • timeout (float) – Maximum seconds to wait for the next token before the iterator stops. Default is 120.0.

Example usage:

import threading
from eole.inference_engine import InferenceEnginePY
from eole.predict.streamer import GenerationStreamer

engine = InferenceEnginePY(config)
streamer = GenerationStreamer(engine.predictor.vocabs,
                              engine.transform_pipe)

def run():
    engine.infer_list(["Hello, how are you?"], streamer=streamer)

thread = threading.Thread(target=run, daemon=True)
thread.start()

for chunk in streamer:
    print(chunk, end="", flush=True)

thread.join()

end()

Signal that generation is complete.

Must be called once by the inference thread after the last token has been put, so that the consumer iterator can terminate cleanly.

put(token_ids)

Add newly generated token IDs to the stream.

Called by the generation loop after each decoding step.

  • Parameters: token_ids – A 1-D tensor or list of shape (batch_size,) containing the token IDs produced at the current step. Only the first element is used for streaming.
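
The producer side of this contract (one put() call per decoding step, then a single end()) can be illustrated with a minimal stand-in. This sketch skips detokenization entirely and is not the actual GenerationStreamer implementation; it only shows how the queue-backed put/end/iterator protocol fits together.

```python
import queue

class MiniStreamer:
    """Minimal stand-in for GenerationStreamer illustrating the
    put()/end()/iterator contract (no detokenization)."""

    _END = object()  # sentinel signalling that generation is complete

    def __init__(self, timeout=120.0):
        self.q = queue.Queue()
        self.timeout = timeout

    def put(self, token_ids):
        # only the first batch element is streamed
        self.q.put(token_ids[0])

    def end(self):
        self.q.put(self._END)

    def __iter__(self):
        while True:
            item = self.q.get(timeout=self.timeout)
            if item is self._END:
                return
            yield item
```

In real use, put() is called from the inference thread while the consumer iterates on the main thread, exactly as in the threading example above.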

Decoding Strategies

class eole.predict.decode_strategy.DecodeStrategy(pad, bos, eos, unk, start, batch_size, parallel_paths, global_scorer, min_length, block_ngram_repeat, exclusion_tokens, return_attention, max_length, ban_unk_token, add_estimator)

Bases: object

Base class for generation strategies.

  • Parameters:
    • pad (int) – Magic integer in output vocab.
    • bos (int) – Magic integer in output vocab.
    • eos (int) – Magic integer in output vocab.
    • unk (int) – Magic integer in output vocab.
    • start (int) – Magic integer in output vocab.
    • batch_size (int) – Current batch size.
    • parallel_paths (int) – Decoding strategies like beam search use parallel paths. Each batch is repeated parallel_paths times in relevant state tensors.
    • min_length (int) – Shortest acceptable generation, not counting begin-of-sentence or end-of-sentence.
    • max_length (int) – Longest acceptable sequence, not counting begin-of-sentence (presumably there has been no EOS yet if max_length is used as a cutoff).
    • ban_unk_token (Boolean) – Whether unk token is forbidden
    • block_ngram_repeat (int) – Block beams where block_ngram_repeat-grams repeat.
    • exclusion_tokens (set[int]) – If a gram contains any of these tokens, it may repeat.
    • return_attention (bool) – Whether to work with attention too. If this is true, it is assumed that the decoder is attentional.
  • Variables:
    • pad (int) – See above.
    • bos (int) – See above.
    • eos (int) – See above.
    • unk (int) – See above.
    • start (int) – See above.
    • predictions (list[list[LongTensor]]) – For each batch, holds a list of beam prediction sequences.
    • scores (list[list[FloatTensor]]) – For each batch, holds a list of scores.
    • attention (list[list[FloatTensor or list[]]]) – For each batch, holds a list of attention sequence tensors (or empty lists) having shape (step, inp_seq_len), where inp_seq_len is the length of the sample (not the max length of all input sequences).
    • alive_seq (LongTensor) – Shape (B x parallel_paths, step). This sequence grows in the step axis on each call to :func:advance().
    • is_finished (ByteTensor or NoneType) – Shape (B, parallel_paths). Initialized to None.
    • alive_attn (FloatTensor or NoneType) – If tensor, shape is (B x parallel_paths, step, inp_seq_len), where inp_seq_len is the (max) length of the input sequence.
    • target_prefix (LongTensor or NoneType) – If tensor, shape is (B x parallel_paths, prefix_seq_len), where prefix_seq_len is the (max) length of the pre-fixed prediction.
    • min_length (int) – See above.
    • max_length (int) – See above.
    • ban_unk_token (Boolean) – See above.
    • block_ngram_repeat (int) – See above.
    • exclusion_tokens (set[int]) – See above.
    • return_attention (bool) – See above.
    • done (bool) – See above.

advance(log_probs, attn)

DecodeStrategy subclasses should override advance().

Advance is used to update self.alive_seq, self.is_finished, and, when appropriate, self.alive_attn.

block_ngram_repeats(log_probs)

We prevent the beam from going in any direction that would repeat any ngram of size <block_ngram_repeat> more than once.

The way we do it: we maintain a list of all ngrams of size <block_ngram_repeat> that is updated each time the beam advances, and manually set the probability of any token that would lead to a repeated ngram to 0.

This improves on the previous version’s complexity:

  • previous version’s complexity: batch_size * beam_size * len(self)
  • current version’s complexity: batch_size * beam_size

This improves on the previous version’s accuracy:

  • Previous version blocks the whole beam, whereas here we only block specific tokens.
  • Previously, prediction would fail when all beams contained repeated ngrams; that can no longer happen here.
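
The blocking step can be sketched in pure Python. This is illustrative only: the real method operates on batched log-prob tensors for all beams at once, while the sketch below handles a single beam, with a dict standing in for one vocab-sized score row.

```python
def block_repeated_ngrams(seq, forbidden, log_probs, n):
    """Mask the score of any token that would complete an n-gram
    already present in `forbidden`.

    seq:       list[int], tokens generated so far for one beam
    forbidden: set[tuple[int, ...]], n-grams seen so far in this beam
    log_probs: dict[int, float], stand-in for a vocab-sized score row
    n:         size of the blocked n-grams
    """
    if len(seq) < n - 1:
        return log_probs  # not enough context to form an n-gram yet
    prefix = tuple(seq[-(n - 1):])  # last n-1 generated tokens
    for tok in list(log_probs):
        if prefix + (tok,) in forbidden:
            log_probs[tok] = float("-inf")  # probability 0
    return log_probs
```

Only tokens completing a forbidden n-gram are masked, so the beam itself survives, which is the accuracy improvement described above.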

initialize(device=None, target_prefix=None)

DecodeStrategy subclasses should override initialize().

initialize should be called before all actions; it prepares the necessary ingredients for decoding.

maybe_update_forbidden_tokens()

We complete and reorder the list of forbidden_tokens.

maybe_update_target_prefix(select_index)

We update / reorder target_prefix for the alive paths.

target_prefixing(log_probs)

Fix the first part of predictions with self.target_prefix.

Args: log_probs (FloatTensor): logits of size (B, vocab_size).

Returns: log_probs (FloatTensor): modified logits in (B, vocab_size).
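
For illustration, forcing a single prefix token at one decoding step can be sketched like this. The function name and the list standing in for one (vocab_size,) row of log_probs are hypothetical; the actual method handles batches and prefixes of differing lengths.

```python
import math

def apply_target_prefix(log_probs, prefix_token):
    """Force the decoder to emit `prefix_token` at this step by masking
    every other entry of the score row to -inf."""
    return [lp if tok == prefix_token else -math.inf
            for tok, lp in enumerate(log_probs)]
```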

update_finished()

DecodeStrategy subclasses should override update_finished().

update_finished is used to update self.predictions, self.scores, and other “output” attributes.

class eole.predict.beam_search.BeamSearchBase(beam_size, batch_size, pad, bos, eos, unk, start, n_best, global_scorer, min_length, max_length, return_attention, block_ngram_repeat, exclusion_tokens, stepwise_penalty, ratio, ban_unk_token, add_estimator=False)

Bases: DecodeStrategy

Generation beam search.

Note that the attributes list is not exhaustive. Rather, it highlights tensors to document their shape. (Since the state variables’ “batch” size decreases as beams finish, we denote this axis with a B rather than batch_size).

  • Parameters:
    • beam_size (int) – Number of beams to use (see base parallel_paths).
    • batch_size (int) – See base.
    • pad (int) – See base.
    • bos (int) – See base.
    • eos (int) – See base.
    • unk (int) – See base.
    • start (int) – See base.
    • n_best (int) – Don’t stop until at least this many beams have reached EOS.
    • global_scorer (eole.predict.GNMTGlobalScorer) – Scorer instance.
    • min_length (int) – See base.
    • max_length (int) – See base.
    • return_attention (bool) – See base.
    • block_ngram_repeat (int) – See base.
    • exclusion_tokens (set[int]) – See base.
  • Variables:
    • _batch_offset (LongTensor) – Shape (B,).
    • _beam_offset (LongTensor) – Shape (batch_size x beam_size,).
    • alive_seq (LongTensor) – See base.
    • topk_log_probs (FloatTensor) – Shape (B, beam_size). These are the scores used for the topk operation.
    • src_len (LongTensor) – Lengths of encodings. Used for masking attentions.
    • select_indices (LongTensor or NoneType) – Shape (B x beam_size,). This is just a flat view of the _batch_index.
    • topk_scores (FloatTensor) – Shape (B, beam_size). These are the scores a sequence will receive if it finishes.
    • topk_ids (LongTensor) – Shape (B, beam_size). These are the word indices of the topk predictions.
    • _batch_index (LongTensor) – Shape (B, beam_size).
    • _prev_penalty (FloatTensor or NoneType) – Shape (B, beam_size). Initialized to None.
    • _coverage (FloatTensor or NoneType) – Shape (1, B x beam_size, inp_seq_len).
    • hypotheses (list[list[Tuple[Tensor]]]) – Contains a tuple of score (float), sequence (long), and attention (float or None).

advance(log_probs, attn)

DecodeStrategy subclasses should override advance().

Advance is used to update self.alive_seq, self.is_finished, and, when appropriate, self.alive_attn.

initialize(*args, **kwargs)

DecodeStrategy subclasses should override initialize().

initialize should be called before all actions; it prepares the necessary ingredients for decoding.

update_finished()

DecodeStrategy subclasses should override update_finished().

update_finished is used to update self.predictions, self.scores, and other “output” attributes.

eole.predict.greedy_search.sample_with_temperature(logits, temperature, top_k, top_p)

Select next tokens randomly from the top k possible next tokens.

Samples from a categorical distribution over the top_k words using the category probabilities logits / temperature.

  • Parameters:
    • logits (FloatTensor) – Shaped (batch_size, vocab_size). These can be logits ((-inf, inf)) or log-probs ((-inf, 0]). (The distribution actually uses the log-probabilities logits - logits.logsumexp(-1), which equals the logits if they are log-probabilities summing to 1.)
    • temperature (float) – Used to scale down logits. The higher the value, the more likely it is that a non-max word will be sampled.
    • top_k (int) – This many words could potentially be chosen. The other logits are set to have probability 0.
    • top_p (float) – Keep the most likely words until the cumulative probability exceeds p. If used together with top_k, both conditions are applied.
  • Returns:
    • topk_ids: Shaped (batch_size, 1). These are the sampled word indices in the output vocab.
    • topk_scores: Shaped (batch_size, 1). These are essentially (logits / temperature)[topk_ids].
  • Return type: (LongTensor, FloatTensor)
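
The filtering that precedes sampling can be sketched over a single vocabulary row; the real function operates on batched tensors and then samples from the surviving entries, but the masking logic is the same in spirit. Everything below is an illustrative sketch, not the eole implementation.

```python
import math

def filter_logits(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Scale logits by temperature, then mask everything outside the
    top-k / nucleus (top-p) set to -inf. Operates on one plain-Python
    row; top_k <= 0 disables top-k, top_p = 1.0 disables nucleus."""
    scaled = [x / temperature for x in logits]
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = set(order if top_k <= 0 else order[:top_k])
    if top_p < 1.0:
        # softmax over the scaled logits (numerically stabilized)
        z = max(scaled)
        exps = [math.exp(x - z) for x in scaled]
        total = sum(exps)
        cum, nucleus = 0.0, set()
        for i in order:
            nucleus.add(i)
            cum += exps[i] / total
            if cum > top_p:  # keep words until cumulative prob exceeds p
                break
        keep &= nucleus  # both conditions applied when combined
    return [x if i in keep else -math.inf for i, x in enumerate(scaled)]
```

Sampling then draws from a categorical distribution over the unmasked entries.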

Scoring

class eole.predict.penalties.PenaltyBuilder(cov_pen, length_pen)

Bases: object

Returns the Length and Coverage Penalty function for Beam Search.

  • Parameters:
    • length_pen (str) – option name of length pen
    • cov_pen (str) – option name of cov pen
  • Variables:
    • has_cov_pen (bool) – Whether coverage penalty is None (applying it is a no-op). Note that the converse isn’t true. Setting beta to 0 should force coverage length to be a no-op.
    • has_len_pen (bool) – Whether length penalty is None (applying it is a no-op). Note that the converse isn’t true. Setting alpha to 1 should force length penalty to be a no-op.
    • coverage_penalty (callable[[FloatTensor, float], FloatTensor]) – Calculates the coverage penalty.
    • length_penalty (callable[[int, float], float]) – Calculates the length penalty.

coverage_none(cov, beta=0.0)

Returns zero as penalty

coverage_summary(cov, beta=0.0)

Our summary penalty.

coverage_wu(cov, beta=0.0)

GNMT coverage re-ranking score.

See “Google’s Neural Machine Translation System” (Wu et al., 2016). cov is expected to be sized (*, seq_len), where * is probably batch_size x beam_size but could be several dimensions like (batch_size, beam_size). If cov is attention, then the seq_len axis probably sums to (almost) 1.

length_average(cur_len, alpha=1.0)

Returns the current sequence length.

length_none(cur_len, alpha=0.0)

Returns unmodified scores.

length_wu(cur_len, alpha=0.0)

GNMT length re-ranking score.

See “Google’s Neural Machine Translation System” (Wu et al., 2016).
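
The length penalty in that paper is lp(Y) = ((5 + |Y|) / 6)^alpha, by which the cumulative log-probability of a hypothesis is divided; a minimal sketch, assuming this standard formula:

```python
def length_wu(cur_len, alpha=0.0):
    """GNMT length penalty, lp(Y) = ((5 + |Y|) / 6) ** alpha.
    Beam scores are divided by this value, so alpha > 0 favors
    longer hypotheses; alpha = 0 makes it a no-op (always 1)."""
    return ((5 + cur_len) / 6.0) ** alpha
```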

class eole.predict.GNMTGlobalScorer(alpha, beta, length_penalty, coverage_penalty)

Bases: object

NMT re-ranking.

  • Parameters:
    • alpha (float) – Length parameter.
    • beta (float) – Coverage parameter.
    • length_penalty (str) – Length penalty strategy.
    • coverage_penalty (str) – Coverage penalty strategy.
  • Variables: